Last updated: 2025-04-16

Checks: 5 2

Knit directory: PIPAC_spatial/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown is untracked by Git. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20240917) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

absolute relative
/home/hnatri/PIPAC_spatial/ .
/home/hnatri/PIPAC_spatial/code/PIPAC_colors_themes.R code/PIPAC_colors_themes.R
/home/hnatri/PIPAC_spatial/code/plot_functions.R code/plot_functions.R
/home/hnatri/PIPAC_spatial/analysis/hierarchical_clustering.Rmd analysis/hierarchical_clustering.Rmd

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 281c5e5. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .RData
    Ignored:    analysis/figure/
    Ignored:    celltype_markers.tsv
    Ignored:    immune_cluster_marker_annotations.tsv
    Ignored:    immune_cluster_marker_annotations_2ndpass.tsv
    Ignored:    main_cluster_marker_annotations.tsv
    Ignored:    nonimmune_cluster_marker_annotations.tsv
    Ignored:    nonimmune_cluster_marker_annotations_2ndpass.tsv

Untracked files:
    Untracked:  MR_PIPACTMA3-Rerun_highres.png
    Untracked:  Rplots.pdf
    Untracked:  analysis/arm2_comparative_analysis.Rmd
    Untracked:  analysis/arm3_DEGs.Rmd
    Untracked:  analysis/arm3_comparative_analysis.Rmd
    Untracked:  analysis/arm3_niche_comparison.Rmd
    Untracked:  analysis/feature_expression.Rmd
    Untracked:  analysis/hierarchical_clustering.Rmd
    Untracked:  analysis/niche_construction_n20.Rmd
    Untracked:  analysis/pca_variance_decomp.Rmd
    Untracked:  annotation_dimplot.pdf
    Untracked:  code/construct_niches.R
    Untracked:  code/plot_metadata.R
    Untracked:  code/plot_save_pdf.R
    Untracked:  code/run_rscript.sh
    Untracked:  code/transcript_contamination.R
    Untracked:  code/update_metadata.R
    Untracked:  code/vardecomp.R
    Untracked:  coh004.pdf
    Untracked:  demographics_grid.pdf
    Untracked:  output/outputFile-Meanplot.pdf
    Untracked:  output/outputFile-Variance.csv
    Untracked:  output/outputFile-VarianceExplained-Boxplot.pdf
    Untracked:  output/scrna-Meanplot.pdf
    Untracked:  output/scrna-Variance.csv
    Untracked:  output/scrna-VarianceExplained-Boxplot.pdf
    Untracked:  output/vardecomp.tsv
    Untracked:  tissue_celltypeprop_scatter.pdf

Unstaged changes:
    Modified:   analysis/add_metadata.Rmd
    Modified:   analysis/annotation_2nd_pass.Rmd
    Modified:   analysis/index.Rmd
    Modified:   analysis/niche_construction.Rmd
    Modified:   analysis/plot_by_group.Rmd
    Modified:   analysis/splitting_samples.Rmd
    Modified:   code/PIPAC_colors_themes.R

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


There are no past versions. Publish this analysis with wflow_publish() to start tracking its development.


Packages and environment variables

suppressPackageStartupMessages({
  library(workflowr)
  library(arrow)
  library(Seurat)
  library(SeuratObject)
  library(SeuratDisk)
  library(tidyverse)
  library(tibble)
  library(ggplot2)
  library(ggpubr)
  library(ggrepel)
  library(googlesheets4)
  library(workflowr)
  library(dittoSeq)})

Environment variables and helper functions

setwd("/home/hnatri/PIPAC_spatial/")
set.seed(9999)
options(scipen = 99999)
options(ggrepel.max.overlaps = Inf)

source("/home/hnatri/PIPAC_spatial/code/PIPAC_colors_themes.R")
source("/home/hnatri/PIPAC_spatial/code/plot_functions.R")

Import data

seurat_data <- readRDS("/tgen_labs/banovich/PIPAC/Seurat/PIPAC_NC50_NN20_PC20_Seurat_annotated_metadata.rds")
seurat_data$Tissue <- as.character(seurat_data$Tissue)
seurat_data$Sample <- as.character(seurat_data$Sample)
seurat_data$Timepoint <- as.character(seurat_data$Timepoint)
seurat_data$Location_Quadrant <- as.character(seurat_data$Location_Quadrant)

DotPlot heatmap

Idents(seurat_data) <- seurat_data$Annotation
markers <- FindAllMarkers(seurat_data,
                          return.thresh = 0.01,
                          logfc.threshold = 0.5,
                          min.pct = 0.20,
                          only.pos = T,
                          verbose = F)

table(markers$cluster)

EpiTumor1      Meso        M9        F1        F3        L2      Mast        L4 
       72        49        77        22        38        60        21        16 
       F5     Endo1        L3        M1        M5        L1        M2        M7 
        0        44        65        18        70        41        78        72 
  Plasma1   Plasma3        F2     Endo3      Peri        F4        B1        L5 
       33        28        30        39        27        25        43        92 
  Plasma2        M8     Endo2        M4        M6        B2        DC        M3 
        9        85        47        22        30        53        11        28 
EpiTumor2 
       89 
top_markers <- markers %>%
  arrange(dplyr::desc(abs(avg_log2FC))) %>%
  group_by(cluster) %>%
  dplyr::slice(1:20)

create_dotplot_heatmap(seurat_object = seurat_data,
                       plot_features = unique(top_markers$gene),
                       group_var = "Annotation",
                       group_colors = pipac_celltype_col,
                       column_title = "",
                       row_km = 5,
                       col_km = 5,
                       row.order = NULL,
                       col.order = NULL)

Cluster cells

seurat_data_downsample <- subset(seurat_data, downsample = 30000/length(unique(seurat_data$Annotation)))

seurat_data_downsample <- ScaleData(seurat_data_downsample)

table(seurat_data_downsample$Annotation)

       B1        B2        DC     Endo1     Endo2     Endo3 EpiTumor1 EpiTumor2 
      909       909       909       909       909       909       909       909 
       F1        F2        F3        F4        F5        L1        L2        L3 
      909       909       909       909       909       909       909       909 
       L4        L5        M1        M2        M3        M4        M5        M6 
      909       303       909       909        18       909       909       909 
       M7        M8        M9      Mast      Meso      Peri   Plasma1   Plasma2 
      909       909       909       909       909       909       909       909 
  Plasma3 
      909 
DoHeatmap(seurat_data_downsample,
          features = unique(top_markers$gene),
          group.by = "Annotation",
          group.colors = pipac_celltype_col)

# To build on command line, run Rscript -e "rmarkdown::render('/home/hnatri/PIPAC_spatial/analysis/hierarchical_clustering.Rmd')"
# Then "mv *.html /home/hnatri/PIPAC_spatial/docs/"
# Gene expression data from the RNA assay
heatmap_data <- GetAssayData(seurat_data_downsample, slot = "scale.data", assay = "RNA")
dim(heatmap_data)
[1]   480 28500
# Order by cluster, then by response
cellpop_cells_metadata <- data.frame("cell" = rownames(seurat_data_downsample@meta.data),
                                     "celltype" = seurat_data_downsample@meta.data$Annotation,
                                     "tissue" = seurat_data_downsample@meta.data$Tissue,
                                     "sample" = seurat_data_downsample@meta.data$Sample)

# Colum annotations
col <- list()
col$celltype <- pipac_celltype_col
col_ha <- HeatmapAnnotation(
  df = data.frame(celltype = cellpop_cells_metadata$celltype),
  annotation_height = unit(4, "mm"),
  col = col
)

# Quantile-normalizing rows for plotting
heatmap_data_qqnorm <- t(apply(heatmap_data, 1, function(xx){qqnorm(rank(xx, ties.method = "random"), plot = F)$x}))
colnames(heatmap_data_qqnorm) <- colnames(heatmap_data)
rownames(heatmap_data_qqnorm) <- rownames(heatmap_data)

# Heatmap colors
#col_fun2 <- colorRamp2(quantile(as.matrix(heatmap_data_qqnorm), seq(0, 1, by = 0.25)), viridis(5))

plot_func <- function(){
  hm_rna  <- Heatmap(as.matrix(heatmap_data_qqnorm), 
                     name = "Gene exp", # Title of legend
                     #col = col_fun2,
                     col = viridis(100),
                     use_raster = T,
                     #column_title = "Cells", row_title = "Gene expression",
                     column_title = NULL,
                     row_title = NULL,
                     #column_split = cellpop_cells_metadata$cluster,
                     #row_split = all_markers_rna_sct$Marker_type,
                     #use_raster = TRUE,
                     show_column_names = FALSE,
                     show_row_names = FALSE,
                     cluster_rows = TRUE,
                     row_km = 10,
                     cluster_columns = TRUE,
                     column_km = 10,
                     top_annotation = col_ha,
                     #right_annotation = row_ha_labels,
                     height = nrow(heatmap_data)*unit(1, "mm"))
                     #bottom_annotation = bottom_ha,
                     #row_names_gp = gpar(fontsize = 7),  # Text size for row names
                     #row_names_side = "right")
      
  heatmap <- draw(hm_rna)
}

p <- plot_func()

# Extracting clusters
cl_list <- column_order(p)

for (i in 1:length(column_order(p))){
  if (i == 1) {
  clu <- t(t(colnames(heatmap_data_qqnorm[,column_order(p)[[i]]])))
  out <- cbind(clu, paste("cluster", i, sep=""))
  colnames(out) <- c("ID", "Cluster")
  } else {
  clu <- t(t(colnames(heatmap_data_qqnorm[,column_order(p)[[i]]])))
  clu <- cbind(clu, paste("cluster", i, sep=""))
  out <- rbind(out, clu)
  }
}

write.csv(out, "/scratch/hnatri/PIPAC_clustering_cols.csv")

out <- read.csv("/scratch/hnatri/PIPAC_clustering_cols.csv")



out$celltype <- mapvalues(x = out$ID,
                          from = colnames(seurat_data),
                          to = seurat_data$Annotation)

unique(out$Cluster)
 [1] "cluster1"  "cluster2"  "cluster3"  "cluster4"  "cluster5"  "cluster6" 
 [7] "cluster7"  "cluster8"  "cluster9"  "cluster10"
unique(out$celltype)
 [1] "Endo3"     "B1"        "L2"        "Plasma2"   "Plasma1"   "L3"       
 [7] "F5"        "M4"        "Plasma3"   "Mast"      "L1"        "M5"       
[13] "B2"        "Peri"      "Endo2"     "L4"        "M7"        "EpiTumor1"
[19] "DC"        "M8"        "M1"        "M6"        "L5"        "Endo1"    
[25] "M2"        "M9"        "M3"        "F4"        "F2"        "F3"       
[31] "F1"        "Meso"      "EpiTumor2"
out %>% filter(Cluster == "cluster1") %>%
  dplyr::select(celltype) %>%
  unlist() %>%
  table()
.
       B1        B2        DC     Endo2     Endo3 EpiTumor1        F5        L1 
       34         9         7         2         4         2        16         9 
       L2        L3        L4        M1        M4        M5        M7        M8 
       22         7        15         2         2         3         5         1 
     Mast      Peri   Plasma1   Plasma2   Plasma3 
        7         2       898       490       102 
out %>% filter(Cluster == "cluster2") %>%
  dplyr::select(celltype) %>%
  unlist() %>%
  table()
.
       B1        B2        DC     Endo1     Endo3 EpiTumor1        F5        L1 
      683       810         8         1         1         2         9         5 
       L2        L3        L4        L5        M1        M4        M5        M6 
       14         3        29         1         8         6         2         2 
       M7        M8   Plasma1   Plasma2 
       78        22         5        10 
out %>% filter(Cluster == "cluster3") %>%
  dplyr::select(celltype) %>%
  unlist() %>%
  table()
.
       B1        B2        DC     Endo1     Endo2     Endo3 EpiTumor1 EpiTumor2 
       12        18        14         2        15        14         8         1 
       F1        F2        F3        F4        F5        L1        L2        L3 
        1         4         3         4       116        33         5        13 
       L4        L5        M1        M2        M3        M4        M5        M6 
       36         1       719       291        18        98       426       586 
       M7        M8        M9      Mast      Meso      Peri   Plasma2   Plasma3 
      411       441       148         8         5         3        30         5 
out %>% filter(Cluster == "cluster4") %>%
  dplyr::select(celltype) %>%
  unlist() %>%
  table()
.
       B1        B2        DC     Endo1     Endo2     Endo3 EpiTumor1        F1 
      131        43       243        36        60       555       199        52 
       F3        F4        F5        L1        L2        L3        L4        L5 
        1        20       526        89         7        30       295         2 
       M1        M2        M4        M5        M6        M7        M8      Mast 
      169         4       742         3        12        40        25       843 
     Meso      Peri   Plasma1   Plasma2   Plasma3 
       41        50         1       135         4 
out %>% filter(Cluster == "cluster5") %>%
  dplyr::select(celltype) %>%
  unlist() %>%
  table()
.
     B1      B2      DC   Endo1   Endo2   Endo3      F5      L1      L2      L3 
     20       8      10       1       2       1      10     695     781     824 
     L4      L5      M2      M4      M6      M7      M8    Mast Plasma1 Plasma2 
    489     295       1       4       1      10       3       1       1       3 
Plasma3 
      2 
out %>% filter(Cluster == "cluster6") %>%
  dplyr::select(celltype) %>%
  unlist() %>%
  table()
.
       B1        B2        DC     Endo1     Endo2     Endo3 EpiTumor1        F1 
       14         6       341       846       779       204         7        19 
       F2        F3        F4        F5        L1        L2        L3        L4 
       10         5        10        36         7         7        12        17 
       M1        M2        M4        M6        M7        M8      Mast      Meso 
        2         1        28        17         9         4        12         9 
     Peri   Plasma2   Plasma3 
      692        10        10 
out %>% filter(Cluster == "cluster7") %>%
  dplyr::select(celltype) %>%
  unlist() %>%
  table()
.
       B1        B2        DC     Endo1     Endo2     Endo3 EpiTumor1        F1 
       13         9       276        19        29        20         5       826 
       F2        F3        F4        F5        L1        L2        L3        L4 
      837       343       827       188        65        56        14        28 
       L5        M1        M4        M6        M7        M8      Mast      Meso 
        2         6        24       199         7         2        28        79 
     Peri   Plasma1   Plasma2   Plasma3 
      139         1       231       618 
out %>% filter(Cluster == "cluster8") %>%
  dplyr::select(celltype) %>%
  unlist() %>%
  table()
.
       B1        B2        DC     Endo1     Endo2     Endo3 EpiTumor1        F1 
        1         4        10         2        20       110         2        11 
       F2        F3        F4        F5        L1        L2        L3        M6 
       58       557        48         8         2        15         4        29 
       M7        M8        M9      Mast      Meso      Peri   Plasma1   Plasma3 
        3         4         1         8       775        23         2       168 
out %>% filter(Cluster == "cluster9") %>%
  dplyr::select(celltype) %>%
  unlist() %>%
  table()
.
     B1      B2   Endo1      L1      L2      L3      L5      M1      M2      M4 
      1       2       2       4       2       2       2       3     612       5 
     M5      M6      M7      M8      M9    Mast Plasma1 
    475      63     346     407     760       2       1 
out %>% filter(Cluster == "cluster10") %>%
  dplyr::select(celltype) %>%
  unlist() %>%
  table()
.
    Endo2 EpiTumor1 EpiTumor2 
        2       684       908 

sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] ComplexHeatmap_2.18.0 viridis_0.6.3         viridisLite_0.4.2    
 [4] circlize_0.4.15       plyr_1.8.8            RColorBrewer_1.1-3   
 [7] dittoSeq_1.14.3       googlesheets4_1.1.0   ggrepel_0.9.3        
[10] ggpubr_0.6.0          lubridate_1.9.2       forcats_1.0.0        
[13] stringr_1.5.0         dplyr_1.1.2           purrr_1.0.1          
[16] readr_2.1.4           tidyr_1.3.0           tibble_3.2.1         
[19] ggplot2_3.4.2         tidyverse_2.0.0       SeuratDisk_0.0.0.9021
[22] Seurat_5.0.1          SeuratObject_5.0.1    sp_1.6-1             
[25] arrow_12.0.0          workflowr_1.7.1      

loaded via a namespace (and not attached):
  [1] fs_1.6.2                    matrixStats_1.0.0          
  [3] spatstat.sparse_3.0-1       bitops_1.0-7               
  [5] httr_1.4.6                  doParallel_1.0.17          
  [7] tools_4.3.0                 sctransform_0.4.1          
  [9] backports_1.4.1             utf8_1.2.3                 
 [11] R6_2.5.1                    lazyeval_0.2.2             
 [13] uwot_0.1.14                 GetoptLong_1.0.5           
 [15] withr_2.5.0                 gridExtra_2.3              
 [17] progressr_0.13.0            cli_3.6.1                  
 [19] Biobase_2.62.0              Cairo_1.6-0                
 [21] spatstat.explore_3.2-1      fastDummies_1.7.3          
 [23] labeling_0.4.2              sass_0.4.6                 
 [25] spatstat.data_3.0-1         ggridges_0.5.4             
 [27] pbapply_1.7-0               parallelly_1.36.0          
 [29] limma_3.58.1                rstudioapi_0.14            
 [31] generics_0.1.3              shape_1.4.6                
 [33] ica_1.0-3                   spatstat.random_3.1-5      
 [35] car_3.1-2                   Matrix_1.6-5               
 [37] fansi_1.0.4                 S4Vectors_0.40.2           
 [39] abind_1.4-5                 lifecycle_1.0.3            
 [41] whisker_0.4.1               yaml_2.3.7                 
 [43] carData_3.0-5               SummarizedExperiment_1.32.0
 [45] SparseArray_1.2.3           Rtsne_0.16                 
 [47] promises_1.2.0.1            crayon_1.5.2               
 [49] miniUI_0.1.1.1              lattice_0.21-8             
 [51] cowplot_1.1.1               magick_2.7.4               
 [53] pillar_1.9.0                knitr_1.43                 
 [55] GenomicRanges_1.54.1        rjson_0.2.21               
 [57] future.apply_1.11.0         codetools_0.2-19           
 [59] leiden_0.4.3                glue_1.6.2                 
 [61] getPass_0.2-4               data.table_1.14.8          
 [63] vctrs_0.6.2                 png_0.1-8                  
 [65] spam_2.9-1                  cellranger_1.1.0           
 [67] gtable_0.3.3                assertthat_0.2.1           
 [69] cachem_1.0.8                xfun_0.39                  
 [71] S4Arrays_1.2.0              mime_0.12                  
 [73] survival_3.5-5              gargle_1.4.0               
 [75] SingleCellExperiment_1.24.0 pheatmap_1.0.12            
 [77] iterators_1.0.14            statmod_1.5.0              
 [79] ellipsis_0.3.2              fitdistrplus_1.1-11        
 [81] ROCR_1.0-11                 nlme_3.1-162               
 [83] bit64_4.0.5                 RcppAnnoy_0.0.20           
 [85] GenomeInfoDb_1.38.5         rprojroot_2.0.3            
 [87] bslib_0.4.2                 irlba_2.3.5.1              
 [89] KernSmooth_2.23-21          colorspace_2.1-0           
 [91] BiocGenerics_0.48.1         tidyselect_1.2.0           
 [93] processx_3.8.1              bit_4.0.5                  
 [95] compiler_4.3.0              curl_5.0.0                 
 [97] git2r_0.32.0                hdf5r_1.3.8                
 [99] DelayedArray_0.28.0         plotly_4.10.2              
[101] scales_1.2.1                lmtest_0.9-40              
[103] callr_3.7.3                 digest_0.6.31              
[105] goftest_1.2-3               spatstat.utils_3.0-3       
[107] presto_1.0.0                rmarkdown_2.22             
[109] XVector_0.42.0              htmltools_0.5.5            
[111] pkgconfig_2.0.3             MatrixGenerics_1.14.0      
[113] highr_0.10                  fastmap_1.1.1              
[115] rlang_1.1.1                 GlobalOptions_0.1.2        
[117] htmlwidgets_1.6.2           shiny_1.7.4                
[119] farver_2.1.1                jquerylib_0.1.4            
[121] zoo_1.8-12                  jsonlite_1.8.5             
[123] RCurl_1.98-1.12             magrittr_2.0.3             
[125] GenomeInfoDbData_1.2.11     dotCall64_1.0-2            
[127] patchwork_1.1.2             munsell_0.5.0              
[129] Rcpp_1.0.10                 reticulate_1.29            
[131] stringi_1.7.12              zlibbioc_1.48.0            
[133] MASS_7.3-60                 parallel_4.3.0             
[135] listenv_0.9.0               deldir_1.0-9               
[137] splines_4.3.0               tensor_1.5                 
[139] hms_1.1.3                   ps_1.7.5                   
[141] igraph_1.4.3                spatstat.geom_3.2-1        
[143] ggsignif_0.6.4              RcppHNSW_0.5.0             
[145] reshape2_1.4.4              stats4_4.3.0               
[147] evaluate_0.21               tzdb_0.4.0                 
[149] foreach_1.5.2               httpuv_1.6.11              
[151] RANN_2.6.1                  polyclip_1.10-4            
[153] future_1.32.0               clue_0.3-64                
[155] scattermore_1.2             broom_1.0.4                
[157] xtable_1.8-4                RSpectra_0.16-1            
[159] rstatix_0.7.2               later_1.3.1                
[161] googledrive_2.1.0           IRanges_2.36.0             
[163] cluster_2.1.4               timechange_0.2.0           
[165] globals_0.16.2